92 research outputs found

SiZer for time series: A new approach to the analysis of trends

Author: Marron J. S.
Park Cheolwoo
Rondonotti Vitaliana
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2007

Smoothing methods and SiZer are useful statistical tools for discovering statistically significant structure in data. Based on scale space ideas originally developed in the computer vision literature, SiZer (SIgnificant ZERo crossing of the derivatives) is a graphical device to assess which observed features are 'really there' and which are just spurious sampling artifacts. In this paper, we develop SiZer-like ideas in time series analysis to address the important issue of significance of trends. This is not a straightforward extension, since one data set does not contain the information needed to distinguish 'trend' from 'dependence'. A new visualization is proposed, which shows the statistician the range of trade-offs that are available. Simulation and real data results illustrate the effectiveness of the method.

Comment: Published at http://dx.doi.org/10.1214/07-EJS006 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

CiteSeerX

Crossref

Carolina Digital Repository

Analysis of dependence among size, rate and duration in internet flows

Author: Hernández-Campos Felix
Jeffay Kevin
Marron J. S.
Park Cheolwoo
Smith F. Donelson
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

In this paper we rigorously examine the evidence for dependence among data size, transfer rate and duration in Internet flows. We emphasize two statistical approaches for studying dependence: Pearson's correlation coefficient and the extremal dependence analysis method. We apply these methods to large data sets of packet traces from three networks. Our major results show that Pearson's correlation coefficients between size and duration are much smaller than one might expect. We also find that correlation coefficients between size and rate are generally small and can be strongly affected by applying thresholds to size or duration. Based on Transmission Control Protocol connection startup mechanisms, we argue that thresholds on size should be more useful than thresholds on duration in the analysis of correlations. Using extremal dependence analysis, we draw a similar conclusion, finding remarkable independence for extremal values of size and rate.

Comment: Published at http://dx.doi.org/10.1214/09-AOAS268 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref

Carolina Digital Repository

Support vector machines with adaptive penalty

Author: Ahn Jeongyoun
Liu Yufeng
Park Cheolwoo
Zhang Hao Helen
Publication date: 01/01/2007

The standard Support Vector Machine (SVM) minimizes the hinge loss function subject to the L2 penalty or the roughness penalty. Recently, the L1 SVM was suggested for variable selection by producing sparse solutions (Bradley and Mangasarian, 1998; Zhu et al., 2003). These learning methods are non-adaptive since their penalty forms are determined before looking at the data, and they often perform well only in a certain type of situation. For instance, the L2 SVM generally works well except when there are too many noise inputs, while the L1 SVM is preferred in the presence of many noise variables. In this article we propose and explore an adaptive learning procedure called the Lq SVM, where the best q > 0 is automatically chosen by the data. Both two- and multi-class classification problems are considered. We show that the new adaptive approach combines the benefits of a class of non-adaptive procedures and gives the best performance of this class across a variety of situations. Moreover, we observe that the proposed Lq penalty is more robust to noise variables than the L1 and L2 penalties. An iterative algorithm is suggested to solve the Lq SVM efficiently. Simulations and real data applications support the effectiveness of the proposed procedure.

Carolina Digital Repository

Multiscale Exploratory Analysis of Regression Quantiles Using Quantile SiZer

Author: Hannig Jan
Lee Thomas
Park Cheolwoo
Publication date: 01/01/2010

The SiZer methodology proposed by Chaudhuri & Marron (1999) is a valuable tool for conducting exploratory data analysis. Since its inception, different versions of SiZer have been proposed in the literature. Most of these SiZer variants target the mean structure of the data and cannot provide any information about the quantile composition of the data. To fill this need, this article proposes a quantile version of SiZer for the regression setting. By inspecting the SiZer maps produced by this new SiZer, real quantile structures hidden in a data set can be revealed more effectively, while at the same time spurious features can be filtered out. The utility of this quantile SiZer is illustrated via applications to both real data and simulated examples.

Carolina Digital Repository

Support vector machines with adaptive Lq penalty

Author: Cheolwoo Park
Hao Helen Zhang
Jeongyoun Ahn
Yufeng Liu

The standard support vector machine (SVM) minimizes the hinge loss function subject to the L2 penalty or the roughness penalty. Recently, the L1 SVM was suggested for variable selection by producing sparse solutions [Bradley, P., Mangasarian, O., 1998

CiteSeerX

Visualization and inference based on wavelet coefficients, SiZer and SiNos

Author: Godtliebsen Fred
Marron J.S.
Park Cheolwoo
Stoev Stilian
Taqqu Murad
Publication date: 01/01/2007

SiZer (SIgnificant ZERo crossing of the derivatives) and SiNos (SIgnificant NOnStationarities) are scale-space based visualization tools for statistical inference. They are used to discover meaningful structure in data through exploratory analysis involving statistical smoothing techniques. Wavelet methods have been successfully used to analyze various types of time series. In this paper, we propose a new time series analysis approach, which combines wavelet analysis with the visualization tools SiZer and SiNos. We use certain functions of wavelet coefficients at different scales as inputs, and then apply SiZer or SiNos to highlight potential non-stationarities. We show that this new methodology can reveal hidden local non-stationary behavior of time series that is otherwise difficult to detect.

Carolina Digital Repository

Long-range dependence in a changing Internet traffic mix

Author: Hernández-Campos Félix
Marron J.S.
Park Cheolwoo
Smith F. Donelson
Publication date: 01/01/2005

This paper provides a deep analysis of long-range dependence in a continually evolving Internet traffic mix by employing a number of recently developed statistical methods. Our study considers time-of-day, day-of-week, and cross-year variations in the traffic on an Internet link. Surprisingly large and consistent differences in the packet-count time series were observed between data from 2002 and 2003. A careful examination, based on stratifying the data according to protocol, revealed that the large difference was driven by a single UDP application that was not present in 2002. Another result was that the observed large differences between the two years showed up only in the packet-count time series, and not in byte counts (while conventional wisdom suggests that these should be similar). We also found and analyzed several time series that exhibited more "bursty" characteristics than could be modeled as Fractional Gaussian Noise. The paper also shows how modern statistical tools can be used to study long-range dependence and non-stationarity in Internet traffic data.

Carolina Digital Repository

Dependent SiZer: Goodness-of-Fit Tests for Time Series Models

Author: Marron J. S.
Park Cheolwoo
Rondonotti Vitaliana
Publication date: 01/01/2004

In this paper, we extend SiZer (SIgnificant ZERo crossing of the derivatives) to dependent data for the purpose of goodness-of-fit tests for time series models. Dependent SiZer compares the observed data with a specific null model being tested by adjusting the statistical inference using an assumed autocovariance function. This new approach uses a SiZer-type visualization to flag statistically significant differences between the data and a given null model. The power of this approach is demonstrated through some examples of time series of Internet traffic data. It is seen that such time series can have even more burstiness than is predicted by the popular, long-range dependent Fractional Gaussian Noise model.

Carolina Digital Repository

Improved SiZer for time series

Author: Hannig Jan
Kang Kee-Hoon
Park Cheolwoo
Publication date: 01/01/2009

Carolina Digital Repository

Experimental Investigation of the Effects of Concrete Alkalinity on Tensile Properties of Preheated Structural GFRP Rebar

Author: Cheolwoo Park
Do Young Moon
Hwasung Roh
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2017

The combined effects of preexposure to high temperature and alkalinity on the tensile performance of structural GFRP reinforcing bars are experimentally investigated. A total of 105 GFRP bar specimens are preexposed to high temperatures between 120°C and 200°C and then immersed in an alkaline solution of pH 12.6 for 100, 300, and 660 days. The test results show that the elastic modulus obtained at 300 immersion days is almost the same as that at 660 immersion days. For all alkali immersion periods considered in the test, the preheated specimens have a slightly lower elastic modulus than the unpreheated specimens, with a maximum difference of only 8%. The tensile strength decreases in all test cases as the alkaline immersion time increases, regardless of the preheating level. The tensile strength of the preheated specimens is about 90% of that of the unpreheated specimens for 300 alkali immersion days. However, after 300 alkali immersion days the tensile strengths are almost identical to each other. These results indicate that the tensile strength and elastic modulus of the structural GFRP reinforcing bars are closely related to the alkali immersion time and only weakly related to the preheating level. The specimens show a typical tensile failure around the preheated location.

Crossref

Directory of Open Access Journals